Three Dimensional Tissue Motion Analysis from Tagged Magnetic Resonance Imaging
Motion estimation of soft tissues during organ deformation has been an important topic in medical imaging studies. Its applications involve a variety of internal and external organs, including the heart, the lung, the brain, and the tongue. Tagged magnetic resonance imaging has been used for decades to observe and quantify motion and strain of deforming tissues. It places temporary noninvasive markers, so-called "tags," in the tissue of interest; the tags deform together with the tissue during motion, producing images that carry motion information in the deformed tag patterns. These images can later be processed using phase-extraction algorithms to achieve motion estimation and strain computation.
In this dissertation, we study three-dimensional (3D) motion estimation and analysis using tagged magnetic resonance images, with applications focused on speech studies and traumatic brain injury modeling. Novel algorithms are developed to assist tagged motion analysis. First, a pipeline of methods, TMAP, is proposed to compute 3D motion from tagged and cine images of the tongue during speech. TMAP produces an estimate of motion along with a multi-subject analysis of motion pattern differences between healthy control subjects and post-glossectomy patients. Second, an enhanced 3D motion estimation algorithm, E-IDEA, is proposed. E-IDEA handles incompressible motion both in the internal tissue region and at the tissue boundaries, reducing the boundary errors and yielding a motion estimate that is more accurate overall. Third, a novel 3D motion estimation algorithm, PVIRA, is developed. Based on image registration and tracking, PVIRA is a faster and more robust method that performs phase extraction in a novel way. Lastly, a method is presented that reveals muscle activity using strain along the line of action of the muscle fiber directions. It is a first step toward relating motion production to individual muscles and provides a new tool for future clinical and scientific use.
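The phase-extraction idea mentioned above can be illustrated with a one-dimensional toy example. This is only a sketch under stated assumptions, not any of the dissertation's algorithms: the tag pattern is an ideal cosine, the motion is a rigid shift, and the names `harmonic_image`, `tag_bin`, and `half_width` are hypothetical.

```python
import numpy as np

def harmonic_image(signal, tag_bin, half_width=3):
    """Band-pass one spectral peak of a tagged signal; return the complex result."""
    spectrum = np.fft.fft(signal)
    mask = np.zeros(len(signal))
    # Keep only the bins around the positive tag frequency (DC is excluded).
    mask[tag_bin - half_width : tag_bin + half_width + 1] = 1.0
    return np.fft.ifft(spectrum * mask)

n, tag_bin = 256, 16              # assumed grid size and tag frequency bin
x = np.arange(n)
k = 2.0 * np.pi * tag_bin / n     # tag wavenumber in radians per sample
true_shift = 1.5                  # rigid displacement in samples (toy case)

reference = 1.0 + np.cos(k * x)                  # undeformed tag pattern
deformed = 1.0 + np.cos(k * (x - true_shift))    # pattern after motion

# The wrapped phase difference of the two harmonic signals equals k * shift.
phase_diff = np.angle(
    harmonic_image(reference, tag_bin) * np.conj(harmonic_image(deformed, tag_bin))
)
estimated_shift = phase_diff / k
```

Because the phase is wrapped, this toy estimate is only unambiguous while the shift stays below half a tag period.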
Memory Consistent Unsupervised Off-the-Shelf Model Adaptation for Source-Relaxed Medical Image Segmentation
Unsupervised domain adaptation (UDA) has been a vital protocol for migrating
information learned from a labeled source domain to facilitate the
implementation in an unlabeled heterogeneous target domain. Although UDA is
typically jointly trained on data from both domains, accessing the labeled
source domain data is often restricted, due to concerns over patient data
privacy or intellectual property. To sidestep this, we propose "off-the-shelf
(OS)" UDA (OSUDA), aimed at image segmentation, by adapting an OS segmentor
trained in a source domain to a target domain, in the absence of source domain
data during adaptation. Toward this goal, we aim to develop a novel batch-wise
normalization (BN) statistics adaptation framework. In particular, we gradually
adapt the domain-specific low-order BN statistics, e.g., mean and variance,
through an exponential momentum decay strategy, while explicitly enforcing the
consistency of the domain shareable high-order BN statistics, e.g., scaling and
shifting factors, via our optimization objective. We also adaptively quantify
the channel-wise transferability to gauge the importance of each channel, via
both low-order statistics divergence and a scaling factor. Furthermore, we
incorporate unsupervised self-entropy minimization into our framework to boost
performance alongside a novel queued, memory-consistent self-training strategy
to utilize the reliable pseudo label for stable and efficient unsupervised
adaptation. We evaluated our OSUDA-based framework on both cross-modality and
cross-subtype brain tumor segmentation and cardiac MR to CT segmentation tasks.
Our experimental results showed that our memory consistent OSUDA performs
better than existing source-relaxed UDA methods and yields similar performance
to UDA methods with source data.
Comment: Published in Medical Image Analysis (extension of MICCAI paper)
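As a rough illustration of adapting low-order BN statistics with a decaying momentum, the sketch below blends stored source statistics with target batch statistics. The schedule `eta0 * exp(-decay * t)`, the function name, and the parameter values are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def adapt_bn_stats(source_mean, source_var, target_batches, eta0=0.99, decay=0.1):
    """Blend source BN statistics with target batch statistics.

    The weight kept on the source statistics shrinks exponentially with the
    iteration index t (an assumed schedule, not taken from the paper).
    Each batch has shape (n_samples, n_channels).
    """
    mean, var = source_mean.copy(), source_var.copy()
    for t, batch in enumerate(target_batches):
        eta = eta0 * np.exp(-decay * t)     # weight on the source statistics
        batch_mean = batch.mean(axis=0)     # per-channel target mean
        batch_var = batch.var(axis=0)       # per-channel target variance
        mean = eta * source_mean + (1.0 - eta) * batch_mean
        var = eta * source_var + (1.0 - eta) * batch_var
    return mean, var

source_mean, source_var = np.zeros(3), np.ones(3)
target_batches = [np.full((4, 3), 5.0) for _ in range(100)]
early_mean, _ = adapt_bn_stats(source_mean, source_var, target_batches[:1])
late_mean, late_var = adapt_bn_stats(source_mean, source_var, target_batches, decay=0.5)
```

Early iterations stay close to the source statistics, while later ones are dominated by the target batches.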
Synthesizing audio from tongue motion during speech using tagged MRI via transformer
Investigating the relationship between internal tissue point motion of the
tongue and oropharyngeal muscle deformation measured from tagged MRI and
intelligible speech can aid in advancing speech motor control theories and
developing novel treatment methods for speech-related disorders. However,
elucidating the relationship between these two sources of information is
challenging, due in part to the disparity in data structure between
spatiotemporal motion fields (i.e., 4D motion fields) and one-dimensional audio
waveforms. In this work, we present an efficient encoder-decoder translation
network for exploring the predictive information inherent in 4D motion fields
via 2D spectrograms as a surrogate of the audio data. Specifically, our encoder
is based on 3D convolutional spatial modeling and transformer-based temporal
modeling. The extracted features are processed by an asymmetric 2D convolution
decoder to generate spectrograms that correspond to 4D motion fields.
Furthermore, we incorporate a generative adversarial training approach into our
framework to further improve synthesis quality on our generated spectrograms.
We experiment on 63 paired motion field sequences and speech waveforms,
demonstrating that our framework enables the generation of clear audio
waveforms from a sequence of motion fields. Thus, our framework has the
potential to improve our understanding of the relationship between these two
modalities and inform the development of treatments for speech disorders.
Comment: SPIE Medical Imaging: Deep Dive Oral
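To make the "2D spectrogram as a surrogate of the audio" idea concrete, the following sketch computes a magnitude spectrogram from a one-dimensional waveform with a short-time Fourier transform. The frame length, hop size, and sampling rate are assumed values chosen for illustration, not those used in the paper.

```python
import numpy as np

def magnitude_spectrogram(signal, n_fft=256, hop=128):
    """Return a (n_fft // 2 + 1, n_frames) magnitude spectrogram."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # One real FFT per frame; transpose so frequency is the first axis.
    return np.abs(np.fft.rfft(frames, axis=1)).T

sample_rate = 8000                          # assumed sampling rate in Hz
t = np.arange(sample_rate) / sample_rate    # one second of audio
tone = np.sin(2.0 * np.pi * 1000.0 * t)     # 1 kHz test tone
spec = magnitude_spectrogram(tone)
peak_bins = spec.argmax(axis=0)             # strongest frequency bin per frame
```

With these settings each bin spans 8000 / 256 = 31.25 Hz, so the 1 kHz tone falls in bin 32 of every frame.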
Successive Subspace Learning for Cardiac Disease Classification with Two-phase Deformation Fields from Cine MRI
Cardiac cine magnetic resonance imaging (MRI) has been used to characterize
cardiovascular diseases (CVD), often providing a noninvasive phenotyping
tool. While recently developed deep learning-based approaches using cine MRI
yield accurate characterization results, their performance is often degraded by
small training samples. In addition, many deep learning models are deemed a
"black box": how they arrive at a prediction, and how reliable that prediction
is, remains largely elusive. To alleviate this, this work proposes a
lightweight successive subspace learning (SSL) framework for CVD
classification, based on an interpretable feedforward design, in conjunction
with a cardiac atlas. Specifically, our hierarchical SSL model is based on (i)
neighborhood voxel expansion, (ii) unsupervised subspace approximation, (iii)
supervised regression, and (iv) multi-level feature integration. In addition,
using two-phase 3D deformation fields (end-diastolic and end-systolic phases)
derived between the atlas and individual subjects as input offers an
objective means of assessing CVD, even with small training samples. We evaluate
our framework on the ACDC2017 database, comprising one healthy group and four
disease groups. Compared with 3D CNN-based approaches, our framework achieves
superior classification performance with 140× fewer parameters, which
supports its potential value in clinical use.
Comment: ISBI 202
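The first two steps listed above, neighborhood voxel expansion and unsupervised subspace approximation, can be sketched roughly as follows. Plain PCA over non-overlapping patches stands in here for the paper's subspace approximation, whose exact transform is not given in the abstract; the function name and the patch and component sizes are hypothetical.

```python
import numpy as np

def subspace_features(volume, patch=2, n_components=4):
    """Expand each voxel neighborhood into a vector, then project with PCA."""
    X, Y, Z = volume.shape
    # Crop so every axis divides evenly into non-overlapping patches.
    v = volume[: X - X % patch, : Y - Y % patch, : Z - Z % patch]
    gx, gy, gz = (s // patch for s in v.shape)
    # Neighborhood expansion: one row of patch**3 values per patch.
    patches = (
        v.reshape(gx, patch, gy, patch, gz, patch)
        .transpose(0, 2, 4, 1, 3, 5)
        .reshape(gx * gy * gz, patch ** 3)
    )
    centered = patches - patches.mean(axis=0)
    # Unsupervised subspace approximation: leading principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    features = (centered @ basis.T).reshape(gx, gy, gz, n_components)
    return features, basis

volume = np.random.default_rng(0).normal(size=(8, 8, 8))
features, basis = subspace_features(volume)
```

No labels or backpropagation are involved at this stage, which is one reason such feedforward designs can get by with few parameters.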
Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix Factorization via Plastic Transformer
The tongue's intricate 3D structure, comprising localized functional units,
plays a crucial role in the production of speech. When measured using tagged
MRI, these functional units exhibit cohesive displacements and derived
quantities that facilitate the complex process of speech production.
Non-negative matrix factorization-based approaches have been shown to estimate
the functional units through motion features, yielding a set of building blocks
and a corresponding weighting map. Investigating the link between weighting
maps and speech acoustics can offer significant insights into the intricate
process of speech production. To this end, in this work, we utilize
two-dimensional spectrograms as a proxy representation, and develop an
end-to-end deep learning framework for translating weighting maps to their
corresponding audio waveforms. Our proposed plastic light transformer (PLT)
framework is based on directional product relative position bias and
single-level spatial pyramid pooling, thus enabling flexible processing of
weighting maps with variable size to fixed-size spectrograms, without input
information loss or dimension expansion. Additionally, our PLT framework
efficiently models the global correlation of wide matrix input. To improve the
realism of our generated spectrograms with relatively limited training samples,
we apply pair-wise utterance consistency with Maximum Mean Discrepancy
constraint and adversarial training. Experimental results on a dataset of 29
subjects speaking two utterances demonstrated that our framework is able to
synthesize speech audio waveforms from weighting maps, outperforming
conventional convolution and transformer models.
Comment: MICCAI 2023 (Oral presentation)
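For readers unfamiliar with the Maximum Mean Discrepancy mentioned above, here is a minimal sketch of a biased squared-MMD estimate between two sets of feature vectors. The Gaussian kernel and its bandwidth are assumptions; the abstract does not say which kernel the authors use or how the constraint is applied to utterance pairs.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Pairwise Gaussian kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between sample sets x and y."""
    return (
        gaussian_kernel(x, x, bandwidth).mean()
        + gaussian_kernel(y, y, bandwidth).mean()
        - 2.0 * gaussian_kernel(x, y, bandwidth).mean()
    )

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 4))
same = mmd_squared(x, x)            # identical sets
shifted = mmd_squared(x, x + 3.0)   # second set moved away from the first
```

The estimate is zero for identical sets and grows as the two distributions move apart, which is what makes it usable as a consistency penalty.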
DRIMET: Deep Registration for 3D Incompressible Motion Estimation in Tagged-MRI with Application to the Tongue
Tagged magnetic resonance imaging (MRI) has been used for decades to observe
and quantify the detailed motion of deforming tissue. However, this technique
faces several challenges such as tag fading, large motion, long computation
times, and difficulties in obtaining diffeomorphic incompressible flow fields.
To address these issues, this paper presents a novel unsupervised phase-based
3D motion estimation technique for tagged MRI. We introduce two key
innovations. First, we apply a sinusoidal transformation to the harmonic phase
input, which enables end-to-end training and avoids the need for phase
interpolation. Second, we propose a Jacobian determinant-based learning
objective to encourage incompressible flow fields for deforming biological
tissues. Our method efficiently estimates 3D motion fields that are accurate,
dense, and approximately diffeomorphic and incompressible. The efficacy of the
method is assessed using human tongue motion during speech, and includes both
healthy controls and patients who have undergone glossectomy. We show that the
method outperforms existing approaches and also improves speed and
robustness to both tag fading and large tongue motion.
Comment: Accepted to MIDL 2023 (full paper)
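A Jacobian determinant-based incompressibility penalty can be sketched as below. For a deformation x + u(x), the determinant of I + grad u equals 1 wherever volume is preserved. The squared-deviation form and the function names are assumptions for illustration, not the paper's actual learning objective.

```python
import numpy as np

def jacobian_determinant(disp):
    """det(I + grad u) for a displacement field of shape (3, X, Y, Z) in voxels."""
    # grads[i, j] holds d u_i / d x_j, giving shape (3, 3, X, Y, Z).
    grads = np.stack(
        [np.stack(np.gradient(disp[i], axis=(0, 1, 2)), axis=0) for i in range(3)],
        axis=0,
    )
    jac = grads + np.eye(3)[:, :, None, None, None]
    # Move the 3x3 matrix axes last so np.linalg.det works per voxel.
    return np.linalg.det(np.moveaxis(jac, (0, 1), (-2, -1)))

def incompressibility_penalty(disp):
    """Mean squared deviation of the Jacobian determinant from 1."""
    return np.mean((jacobian_determinant(disp) - 1.0) ** 2)

grid = np.stack(
    np.meshgrid(np.arange(5.0), np.arange(6.0), np.arange(7.0), indexing="ij")
)
shear = np.zeros_like(grid)
shear[0] = 0.2 * grid[1]      # simple shear: volume preserving
expansion = 0.1 * grid        # uniform 10% stretch along every axis
```

The shear field incurs no penalty, while the uniform expansion has determinant 1.1 cubed at every voxel and is penalized accordingly.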
Pig Heart Data
The data set includes 41 short-axis MR image slices of an in vivo pig heart. Six raters labeled the left and right ventricles as well as the two RV insertion points, yielding 246 labeled images. The cine MR data and labeled images are published.